In today's fast-paced digital world, keeping your data safe and always available is more critical than ever. As enterprises increasingly depend on Kubernetes to deploy and manage their applications, having a robust disaster recovery plan is essential. Enter NetApp® Trident™ with asynchronous SnapMirror® volume replication to keep your data secure and your mind at ease. With the release of v25.06, Trident now supports NVMe/TCP back ends for volume replication using SnapMirror, offering faster and more efficient data transfer capabilities. This blog takes you through the exciting journey of setting up asynchronous SnapMirror volume replication using Trident. So buckle up and let’s dive in! 🌊
Why SnapMirror volume replication?
NetApp SnapMirror technology is a game changer for disaster recovery, enabling efficient data replication between NetApp ONTAP® clusters. With Trident, you can establish mirror relationships between PersistentVolumeClaims (PVCs) on different ONTAP clusters, so that your data is always protected and available. Here’s why you should be excited about these mirror relationships:
Disaster recovery. With SnapMirror, your data is replicated and ready to go in case of any disaster.
Seamless failover. Easily promote secondary volumes during unplanned failovers and perform smooth transitions during planned migrations.
Custom Resource Definition (CRD). Manage mirror relationships effortlessly with the Trident Mirror Relationship (TMR) CRD.
Prerequisites: Ready, set, go!
Before we get started, make sure that you have the following in place:
ONTAP clusters: Both source and destination clusters must have ONTAP SnapMirror asynchronous licenses enabled. Refer to SnapMirror licensing overview in ONTAP for more information. To use Trident SnapMirror volume replication with NVMe/TCP, the ONTAP version must be 9.15.1 or later. Create a cron job on the destination cluster for an asynchronous replication schedule. Example command:
job schedule cron create -name five_minute -minute 0,5,10,15,20,25,30,35,40,45,50,55
Peering: The ONTAP clusters must be in a cluster peer relationship, and the storage virtual machines (SVMs) of the source and destination clusters must be in a vserver peer relationship.
Trident: Trident version must be 25.06 or later. (A quick verification sketch for these prerequisites follows this list.)
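Before moving on, both prerequisites can be verified quickly. From the ONTAP CLI, check cluster and SVM peering health (the prompt and names here are illustrative, not from the walkthrough):
destination::> cluster peer show
destination::> vserver peer show
And from the Kubernetes side, tridentctl reports both server and client versions (version numbers shown are illustrative):
[root@scs000646264 artifacts]# tridentctl version -n trident
+----------------+----------------+
| SERVER VERSION | CLIENT VERSION |
+----------------+----------------+
| 25.06.0        | 25.06.0        |
+----------------+----------------+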
Setting up SnapMirror volume replication for NVMe/TCP: a step-by-step example
Create a Trident back end for the source cluster. Start by creating a Trident back end with the storage driver ontap-san. You can do this either by configuring the TridentBackendConfig (TBC) custom resource definition using the Kubernetes-native approach or by using a custom JSON file with tridentctl, a command-line utility for managing Trident. The configuration is similar to any other ontap-san NVMe back end; the only changes are the username and password. You can either use a cluster management IP with administrator credentials or specify a storage virtual machine with its management IP and credentials. For more information, refer to ONTAP SAN driver details.
# Kubernetes secret required for creating Trident backend from TBC
[root@scs000646264 artifacts]# cat primary-secret.yaml
apiVersion: v1
kind: Secret
metadata:
name: primary-tbc-secret
type: Opaque
stringData:
username: <username>
password: <password>
[root@scs000646264 artifacts]# kubectl create -f primary-secret.yaml -n trident
secret/primary-tbc-secret created
Add the back end. When the back end is configured, you can add it to Trident by using either kubectl or tridentctl. These tools are a convenient way to add the newly configured back end to Trident and make it available for use.
# Kubernetes CR TridentBackendConfig (TBC)
[root@scs000646264 artifacts]# cat primary-tbc.yaml
---
apiVersion: trident.netapp.io/v1
kind: TridentBackendConfig
metadata:
name: primary-backend-tbc
spec:
version: 1
storageDriverName: ontap-san
sanType: nvme
managementLIF: 8.8.8.8
svm: svm0
credentials:
name: primary-tbc-secret
[root@scs000646264 artifacts]# kubectl create -f primary-tbc.yaml -n trident
tridentbackendconfig.trident.netapp.io/primary-backend-tbc created
[root@scs000646264 artifacts]# kubectl get tbc -n trident
NAME BACKEND NAME BACKEND UUID PHASE STATUS
primary-backend-tbc primary-backend-tbc a9b1a3a7-66a8-4d3b-984a-b87d851387c7 Bound Success
# Or, Trident backend json
[root@scs000646264 artifacts]# cat primary-backend.json
{
"version": 1,
"storageDriverName": "ontap-san",
"managementLIF": "8.8.8.8",
"backendName": "primary-backend"
"svm": "svm0",
"username": "<username>",
"password": "<password>",
"sanType": "nvme"
}
[root@scs000646264 artifacts]# tridentctl create b -f primary-backend.json -n trident
+--------------------+----------------+--------------------------------------+--------+------------+---------+
| NAME | STORAGE DRIVER | UUID | STATE | USER-STATE | VOLUMES |
+--------------------+----------------+--------------------------------------+--------+------------+---------+
| primary-backend | ontap-san | 6458337e-a27e-4cde-8707-0f6218214356 | online | normal | 0 |
+--------------------+----------------+--------------------------------------+--------+------------+---------+
Define a storage class. Create a storage class that corresponds to the type of storage driver you need. This step allows you to define the characteristics of the storage that you want to dynamically provision.
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
name: primary-sc
provisioner: csi.trident.netapp.io
parameters:
backendType: ontap-san
storagePools: primary-backend-tbc:.*
allowVolumeExpansion: true
[root@scs000646264 artifacts]# kubectl create -f primary-sc.yaml
storageclass.storage.k8s.io/primary-sc created
[root@scs000646264 artifacts]# kubectl get sc
NAME PROVISIONER RECLAIMPOLICY VOLUMEBINDINGMODE ALLOWVOLUMEEXPANSION AGE
primary-sc csi.trident.netapp.io Delete Immediate true 2s
Create a PVC. Define a PersistentVolumeClaim that specifies the amount of storage you need and references the appropriate storage class. This step ensures that your Kubernetes application has access to the required block storage.
kind: PersistentVolumeClaim
apiVersion: v1
metadata:
name: primary-pvc
spec:
accessModes:
- ReadWriteOnce
resources:
requests:
storage: 20Mi
storageClassName: primary-sc
[root@scs000646264 artifacts]# kubectl create -f primary-pvc.yaml
persistentvolumeclaim/primary-pvc created
Confirm PVC binding. After the PVC is created, verify that it’s successfully bound to a persistent volume (PV). This step confirms that the block storage is ready for use by your applications.
[root@scs000646264 artifacts]# kubectl get pvc
NAME STATUS VOLUME CAPACITY ACCESS MODES STORAGECLASS VOLUMEATTRIBUTESCLASS AGE
primary-pvc Bound pvc-bd77f21b-4522-41e9-bfa1-fca7cf8af672 20Mi RWO primary-sc <unset> 4s
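You can also inspect the backing persistent volume that Trident provisioned (a hedged check; output trimmed, and columns vary by Kubernetes version):
[root@scs000646264 artifacts]# kubectl get pv
NAME                                       CAPACITY   ACCESS MODES   RECLAIM POLICY   STATUS   CLAIM                 STORAGECLASS
pvc-bd77f21b-4522-41e9-bfa1-fca7cf8af672   20Mi       RWO            Delete           Bound    default/primary-pvc   primary-sc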
Use the PVC. You’re now ready to use the PVC in any pod of your choice. Mount the PVC in a pod to store data. The following example simulates an application workload. This pod runs an Nginx container, writing timestamps to a file on the PVC every 20 seconds.
[root@scs000646264 artifacts]# cat pod.yaml
apiVersion: v1
kind: Pod
metadata:
name: source-pod
spec:
containers:
- image: nginx:alpine
name: nginx
volumeMounts:
- mountPath: /pv/pvc
name: local-storage
command: ["/bin/ash", "-c"]
args:
- |
while true; do
echo `date +%Y-%m-%d.%H:%M:%S` >> /pv/pvc/data.txt;
fsync /pv/pvc;
fsync /pv/pvc/data.txt;
tail -n 1 /pv/pvc/data.txt;
sleep 20;
done
nodeSelector:
kubernetes.io/arch: amd64
kubernetes.io/os: linux
volumes:
- name: local-storage
persistentVolumeClaim:
claimName: primary-pvc
[root@scs000646264 artifacts]# kubectl create -f pod.yaml
pod/source-pod created
[root@scs000646264 artifacts]# kubectl get po
NAME READY STATUS RESTARTS AGE
source-pod 1/1 Running 0 15s
[root@scs000646264 artifacts]# kubectl logs source-pod
2025-05-21.07:30:12
2025-05-21.07:30:32
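As a sanity check that the writes are landing on the PVC itself rather than only in the container log, you can read the file directly from the pod (output illustrative):
[root@scs000646264 artifacts]# kubectl exec source-pod -- tail -n 1 /pv/pvc/data.txt
2025-05-21.07:30:32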
Create the source MirrorRelationship custom resource (CR) with local information (localPVCName). Trident fetches the internal information for the volume and the volume’s current data protection (DP) state, then populates the status field of the MirrorRelationship.
kind: TridentMirrorRelationship
apiVersion: trident.netapp.io/v1
metadata:
name: source-tmr
spec:
state: promoted
volumeMappings:
- localPVCName: primary-pvc
[root@scs000646264 artifacts]# kubectl create -f source-tmr.yaml
tridentmirrorrelationship.trident.netapp.io/source-tmr created
[root@scs000646264 artifacts]# kubectl get tmr
NAME DESIRED STATE LOCAL PVC ACTUAL STATE MESSAGE
source-tmr promoted primary-pvc promoted
Get the TridentMirrorRelationship CR to obtain the internal name and SVM of the PVC.
[root@scs000646264 artifacts]# kubectl get tmr source-tmr -o yaml
apiVersion: trident.netapp.io/v1
kind: TridentMirrorRelationship
metadata:
creationTimestamp: "2025-05-21T07:26:15Z"
finalizers:
- trident.netapp.io
generation: 2
name: source-tmr
namespace: default
resourceVersion: "4573176"
uid: 7d0cb8e2-434a-4c1b-b3a8-8d181f744f82
spec:
replicationPolicy: ""
replicationSchedule: ""
state: promoted
volumeMappings:
- localPVCName: primary-pvc
promotedSnapshotHandle: ""
remoteVolumeHandle: ""
status:
conditions:
- lastTransitionTime: "2025-05-21T07:26:15Z"
localPVCName: primary-pvc
localVolumeHandle: svm0:trident_pvc_bd77f21b_4522_41e9_bfa1_fca7cf8af672
message: ""
observedGeneration: 2
remoteVolumeHandle: ""
replicationPolicy: ""
replicationSchedule: ""
state: promoted
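If you only need the volume handle for the destination TMR in the later steps, a JSONPath query pulls it straight from the status shown above (a sketch; the index assumes a single condition entry):
[root@scs000646264 artifacts]# kubectl get tmr source-tmr -o jsonpath='{.status.conditions[0].localVolumeHandle}'
svm0:trident_pvc_bd77f21b_4522_41e9_bfa1_fca7cf8af672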
Perform the following steps for the destination cluster: Create a Trident back end with a replicationPolicy of MirrorAllSnapshots.
# Kubernetes secret required for creating Trident backend from TBC
[root@scs000646264 artifacts]# cat secondary-secret.yaml
apiVersion: v1
kind: Secret
metadata:
name: secondary-tbc-secret
type: Opaque
stringData:
username: <username>
password: <password>
[root@scs000646264 artifacts]# kubectl create -f secondary-secret.yaml -n trident
secret/secondary-tbc-secret created
[root@scs000646264 artifacts]# cat secondary-tbc.yaml
apiVersion: trident.netapp.io/v1
kind: TridentBackendConfig
metadata:
name: secondary-backend-tbc
spec:
version: 1
storageDriverName: ontap-san
sanType: nvme
managementLIF: 7.7.7.7
svm: svm1
replicationPolicy: MirrorAllSnapshots
replicationSchedule: five_minute
credentials:
name: secondary-tbc-secret
[root@scs000646264 artifacts]# kubectl create -f secondary-tbc.yaml -n trident
tridentbackendconfig.trident.netapp.io/secondary-backend-tbc created
[root@scs000646264 artifacts]# kubectl get tbc -n trident
NAME BACKEND NAME BACKEND UUID PHASE STATUS
primary-backend-tbc primary-backend-tbc a9b1a3a7-66a8-4d3b-984a-b87d851387c7 Bound Success
secondary-backend-tbc secondary-backend-tbc c4fbde1a-cf6c-4ee4-8474-488454c926d1 Bound Success
# Or, Trident backend json
[root@scs000646264 artifacts]# cat secondary-backend.json
{
"version": 1,
"storageDriverName": "ontap-san",
"managementLIF": "7:7:7:7",
"backendName": "secondary-backend"
"svm": "svm0",
"username": "<username>",
"password": "<password>",
"sanType": "nvme"
"replicationPolicy": "MirrorAllSnapshots"
"replicationSchedule": "five_minute"
}
[root@scs000646264 artifacts]# tridentctl create b -f secondary-backend.json -n trident
+--------------------+----------------+--------------------------------------+--------+------------+---------+
| NAME | STORAGE DRIVER | UUID | STATE | USER-STATE | VOLUMES |
+--------------------+----------------+--------------------------------------+--------+------------+---------+
| secondary-backend | ontap-san | 2345337e-e67e-4bde-8707-1d5644326548 | online | normal | 0 |
+--------------------+----------------+--------------------------------------+--------+------------+---------+
Create a destination storage class.
[root@scs000646264 artifacts]# cat secondary_sc.yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
name: secondary-sc
provisioner: csi.trident.netapp.io
parameters:
backendType: ontap-san
storagePools: secondary-backend-tbc:.*
allowVolumeExpansion: true
[root@scs000646264 artifacts]# kubectl create -f secondary_sc.yaml
storageclass.storage.k8s.io/secondary-sc created
[root@scs000646264 artifacts]# kubectl get sc
NAME PROVISIONER RECLAIMPOLICY VOLUMEBINDINGMODE ALLOWVOLUMEEXPANSION AGE
primary-sc csi.trident.netapp.io Delete Immediate true 11s
secondary-sc csi.trident.netapp.io Delete Immediate true 3s
Create a destination PVC with the previously created StorageClass to act as the secondary (SnapMirror destination). Trident checks for the TridentMirrorRelationship CR. If the relationship does not exist, Trident does not create the volume, and the PVC remains in a Pending state until the destination TMR is created. If the relationship exists, Trident places the new FlexVol volume onto an SVM that is peered with the remote SVM defined in the MirrorRelationship.
[root@scs000646264 artifacts]# cat secondary-pvc.yaml
kind: PersistentVolumeClaim
apiVersion: v1
metadata:
annotations:
trident.netapp.io/mirrorRelationship: dest-tmr
name: secondary-pvc
spec:
accessModes:
- ReadWriteOnce
resources:
requests:
storage: 20Mi
storageClassName: secondary-sc
[root@scs000646264 artifacts]# kubectl create -f secondary-pvc.yaml
persistentvolumeclaim/secondary-pvc created
[root@scs000646264 artifacts]# kubectl get pvc
NAME STATUS VOLUME CAPACITY ACCESS MODES STORAGECLASS VOLUMEATTRIBUTESCLASS AGE
primary-pvc Bound pvc-bd77f21b-4522-41e9-bfa1-fca7cf8af672 20Mi RWO primary-sc <unset> 43s
secondary-pvc Pending secondary-sc <unset> 4s
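The Pending status is expected at this point; the claim binds only after the destination TMR is created in the next step. If you want to confirm what the claim is waiting on, the PVC's events show it (a hedged check; output trimmed, and exact event messages vary by Trident version):
[root@scs000646264 artifacts]# kubectl describe pvc secondary-pvc
Name:          secondary-pvc
Status:        Pending
...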
Create the destination TMR CR. Trident creates a SnapMirror relationship with the configured relationship policy name (or the ONTAP default) and initializes it. (Provide the value obtained in “Get the TridentMirrorRelationship CR to obtain the internal name and SVM of the PVC,” earlier in this document, as the remoteVolumeHandle in the destination TMR CR.)
kind: TridentMirrorRelationship
apiVersion: trident.netapp.io/v1
metadata:
name: dest-tmr
spec:
state: established
volumeMappings:
- localPVCName: secondary-pvc
remoteVolumeHandle: "svm0:trident_pvc_bd77f21b_4522_41e9_bfa1_fca7cf8af672"
[root@scs000646264 artifacts]# kubectl create -f dest-tmr.yaml
tridentmirrorrelationship.trident.netapp.io/dest-tmr created
[root@scs000646264 artifacts]# kubectl get tmr
NAME DESIRED STATE LOCAL PVC ACTUAL STATE MESSAGE
source-tmr promoted primary-pvc promoted
dest-tmr established secondary-pvc established
Verify the SnapMirror relationship and volume type (DP) on the destination cluster:
stiA300-2491746692774::> snapmirror show
Progress
Source Destination Mirror Relationship Total Last
Path Type Path State Status Progress Healthy Updated
----------- ---- ------------ ------- -------------- --------- ------- --------
svm0:trident_pvc_bd77f21b_4522_41e9_bfa1_fca7cf8af672
XDP svm1:trident_pvc_32d31b4d_ec91_41cd_a95e_9e74bd6b2d9b
Snapmirrored
Idle - true -
stiA300-2491746692774::> volume show -vserver svm1
Vserver Volume Aggregate State Type Size Available Used%
--------- ------------ ------------ ---------- ---- ---------- ---------- -----
svm1 svm0_root stiA300_250_aggr1
online RW 1GB 972.4MB 0%
svm1 trident_pvc_32d31b4d_ec91_41cd_a95e_9e74bd6b2d9b
stiA300_249_aggr1
online DP 22MB 20.42MB 7%
2 entries were displayed.
Create a snapshot class.
[root@scs000646264 artifacts]# cat snapshot-class.yaml
apiVersion: snapshot.storage.k8s.io/v1
kind: VolumeSnapshotClass
metadata:
name: csi-snapclass
driver: csi.trident.netapp.io
deletionPolicy: Delete
[root@scs000646264 artifacts]# kubectl create -f snapshot-class.yaml
volumesnapshotclass.snapshot.storage.k8s.io/csi-snapclass created
[root@scs000646264 artifacts]# kubectl get volumesnapshotclasses
NAME DRIVER DELETIONPOLICY AGE
csi-snapclass csi.trident.netapp.io Delete 5s
The following is the output of the snapshot show command on the destination cluster before any snapshot is created:
stiA300-2491746692774::> snapshot show
---Blocks---
Vserver Volume Snapshot Size Total% Used%
-------- -------- ------------------------------------- -------- ------ -----
svm1 trident_pvc_32d31b4d_ec91_41cd_a95e_9e74bd6b2d9b
snapmirror.33ffbac2-300c-11f0-8dc5-00a098f48d3e_2155208310.2025-05-21_032847
216KB 1% 40%
snapmirror.33ffbac2-300c-11f0-8dc5-00a098f48d3e_2155208310.2025-05-21_033000
136KB 1% 29%
2 entries were displayed.
Create a snapshot.
apiVersion: snapshot.storage.k8s.io/v1
kind: VolumeSnapshot
metadata:
name: "snap1"
spec:
volumeSnapshotClassName: csi-snapclass
source:
persistentVolumeClaimName: primary-pvc
[root@scs000646264 artifacts]# kubectl create -f snapshot.yaml
volumesnapshot.snapshot.storage.k8s.io/snap1 created
[root@scs000646264 artifacts]# kubectl get volumesnapshots
NAME READYTOUSE SOURCEPVC SOURCESNAPSHOTCONTENT RESTORESIZE SNAPSHOTCLASS SNAPSHOTCONTENT CREATIONTIME AGE
snap1 true primary-pvc 20Mi csi-snapclass snapcontent-7a3b0380-6d13-4cb9-88a3-df9528be73a9 5s 6s
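The snapshot is backed by a cluster-scoped VolumeSnapshotContent object, which you can inspect as well (a hedged check; columns trimmed for readability):
[root@scs000646264 artifacts]# kubectl get volumesnapshotcontent
NAME                                               READYTOUSE   DELETIONPOLICY   DRIVER                  VOLUMESNAPSHOT
snapcontent-7a3b0380-6d13-4cb9-88a3-df9528be73a9   true         Delete           csi.trident.netapp.io   snap1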
Output of the snapshot show command on the destination cluster after the first Snapshot copy is taken:
stiA300-2491746692774::> snapshot show
---Blocks---
Vserver Volume Snapshot Size Total% Used%
-------- -------- ------------------------------------- -------- ------ -----
svm1 trident_pvc_32d31b4d_ec91_41cd_a95e_9e74bd6b2d9b
snapmirror.33ffbac2-300c-11f0-8dc5-00a098f48d3e_2155208310.2025-05-21_033000
216KB 1% 19%
snapshot-7a3b0380-6d13-4cb9-88a3-df9528be73a9
268KB 1% 22%
snapmirror.33ffbac2-300c-11f0-8dc5-00a098f48d3e_2155208310.2025-05-21_033740
180KB 1% 16%
3 entries were displayed.
Working with more Snapshot copies:
apiVersion: snapshot.storage.k8s.io/v1
kind: VolumeSnapshot
metadata:
name: "snap1-new"
spec:
volumeSnapshotClassName: csi-snapclass
source:
persistentVolumeClaimName: primary-pvc
[root@scs000646264 artifacts]# kubectl create -f snapshot-1.yaml
volumesnapshot.snapshot.storage.k8s.io/snap1-new created
[root@scs000646264 artifacts]# kubectl get vs
NAME READYTOUSE SOURCEPVC SOURCESNAPSHOTCONTENT RESTORESIZE SNAPSHOTCLASS SNAPSHOTCONTENT CREATIONTIME AGE
snap1 true primary-pvc 20Mi csi-snapclass snapcontent-7a3b0380-6d13-4cb9-88a3-df9528be73a9 6m33s 6m34s
snap1-new true primary-pvc 20Mi csi-snapclass snapcontent-56cf40dc-f968-4aeb-9ffd-ea1f7bcf0942 4s 5s
The transfer of these snapshots to the destination confirms that replication is working:
stiA300-2491746692774::> snapshot show
---Blocks---
Vserver Volume Snapshot Size Total% Used%
-------- -------- ------------------------------------- -------- ------ -----
svm1 trident_pvc_32d31b4d_ec91_41cd_a95e_9e74bd6b2d9b
snapshot-7a3b0380-6d13-4cb9-88a3-df9528be73a9
268KB 1% 20%
snapmirror.33ffbac2-300c-11f0-8dc5-00a098f48d3e_2155208310.2025-05-21_033740
276KB 1% 21%
snapshot-56cf40dc-f968-4aeb-9ffd-ea1f7bcf0942
264KB 1% 20%
snapmirror.33ffbac2-300c-11f0-8dc5-00a098f48d3e_2155208310.2025-05-21_034000
272KB 1% 21%
4 entries were displayed.
In Trident, when the state of the destination TMR is established, it indicates that the destination volume is a Data Protection (DP) volume. DP volumes are used for replication and backup, so that your data is protected and can be recovered in case of failure. To verify that the destination volume is indeed a DP volume, you can use the volume show command:
stiA300-2491746692774::> volume show
Vserver Volume Aggregate State Type Size Available Used%
--------- ------------ ------------ ---------- ---- ---------- ---------- -----
stiA300-249
vol0 aggr0_stiA300_249
online RW 348.6GB 291.9GB 11%
stiA300-250
vol0 aggr0_stiA300_250
online RW 348.6GB 291.7GB 11%
svm1 svm0_root stiA300_250_aggr1
online RW 1GB 972.4MB 0%
svm1 trident_pvc_32d31b4d_ec91_41cd_a95e_9e74bd6b2d9b
stiA300_249_aggr1
online DP 22MB 19.93MB 9%
vs0 root_vs0 stiA300_249_aggr1
online RW 1GB 972.1MB 0%
5 entries were displayed.
Volume replication states
The TMR CRD has the following states to manage the replication relationship (a kubectl shortcut for switching states appears after this list):
Established. When the state of the destination TMR is established, it indicates that the destination volume is a Data Protection (DP) volume. DP volumes are used for replication and backup, so that your data is protected and can be recovered in case of failure.
Promoted. When the TMR state is promoted, it indicates that the destination volume has been elevated to become the new source volume. This means that the destination volume is now writable and can serve I/O operations, effectively taking over the role of the original source volume.
Reestablished. The TMR state is reestablished after failover when the original source volume is ready to resume its role as the primary volume. This typically occurs in scenarios where the original issue that caused the failover has been resolved or maintenance has been completed, and you want to revert the roles of the source and destination volumes to their original configuration. When the TMR state is reestablished, it indicates that the replication relationship between the source and destination volumes has been restored to its original state.
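All three transitions are driven by the spec.state field of the TMR. The sections that follow edit the manifest and re-apply it; as an equivalent shortcut (a sketch, not part of the original walkthrough), the same change can be made with a one-line kubectl patch:
[root@scs000646264 artifacts]# kubectl patch tmr dest-tmr --type=merge -p '{"spec":{"state":"promoted"}}'
tridentmirrorrelationship.trident.netapp.io/dest-tmr patched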
Promote the secondary PVC during an unplanned failover
Perform the following step on the secondary Kubernetes cluster: Update the spec.state field of the TridentMirrorRelationship to promoted.
kind: TridentMirrorRelationship
apiVersion: trident.netapp.io/v1
metadata:
name: dest-tmr
spec:
state: promoted
volumeMappings:
- localPVCName: secondary-pvc
remoteVolumeHandle: "svm0:trident_pvc_bd77f21b_4522_41e9_bfa1_fca7cf8af672"
[root@scs000646264 artifacts]# kubectl apply -f dest-tmr.yaml
tridentmirrorrelationship.trident.netapp.io/dest-tmr configured
[root@scs000646264 artifacts]# kubectl get tmr
NAME DESIRED STATE LOCAL PVC ACTUAL STATE MESSAGE
source-tmr promoted primary-pvc promoted
dest-tmr promoted secondary-pvc promoted
After the destination TMR is promoted, the destination volume becomes an RW volume with no mirror relationship in effect. Check the volume state and the SnapMirror relationship:
stiA300-2491746692774::> snapmirror show
This table is currently empty.
stiA300-2491746692774::> volume show
Vserver Volume Aggregate State Type Size Available Used%
--------- ------------ ------------ ---------- ---- ---------- ---------- -----
stiA300-249
vol0 aggr0_stiA300_249
online RW 348.6GB 291.9GB 11%
stiA300-250
vol0 aggr0_stiA300_250
online RW 348.6GB 291.7GB 11%
svm1 svm0_root stiA300_250_aggr1
online RW 1GB 972.4MB 0%
svm1 trident_pvc_32d31b4d_ec91_41cd_a95e_9e74bd6b2d9b
stiA300_249_aggr1
online RW 22MB 19.93MB 9%
vs0 root_vs0 stiA300_249_aggr1
online RW 1GB 972.1MB 0%
5 entries were displayed.
When the TMR is in the promoted state, you can mount applications with the destination PVC. Test the promoted PVC by mounting it in a pod:
apiVersion: v1
kind: Pod
metadata:
name: test-pod
spec:
containers:
- name: test-container
image: nginx:alpine
imagePullPolicy: IfNotPresent
volumeMounts:
- mountPath: /mnt/test-path
name: test-volume
readOnly: false
restartPolicy: Never
volumes:
- name: test-volume
persistentVolumeClaim:
claimName: secondary-pvc
readOnly: false
[root@scs000646264 artifacts]# kubectl create -f test-pod.yaml
[root@scs000646264 artifacts]# kubectl get po
NAME READY STATUS RESTARTS AGE
source-pod 1/1 Running 0 12m
test-pod 1/1 Running 0 8s
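To confirm that the replicated data survived the failover, read the file that the source pod was writing, now from the promoted PVC (a hedged check; the timestamp shown is illustrative):
[root@scs000646264 artifacts]# kubectl exec test-pod -- tail -n 1 /mnt/test-path/data.txt
2025-05-21.07:41:32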
Restore a mirror relationship after a failover
The TMR state is reestablished after a failover or planned switchover operation when the original source volume is ready to resume its role as the primary volume.
kind: TridentMirrorRelationship
apiVersion: trident.netapp.io/v1
metadata:
name: dest-tmr
spec:
state: reestablished
volumeMappings:
- localPVCName: secondary-pvc
remoteVolumeHandle: "svm0:trident_pvc_bd77f21b_4522_41e9_bfa1_fca7cf8af672"
[root@scs000646264 artifacts]# kubectl apply -f dest-tmr.yaml
tridentmirrorrelationship.trident.netapp.io/dest-tmr configured
The snapmirror show output:
stiA300-2491746692774::> snapmirror show
Progress
Source Destination Mirror Relationship Total Last
Path Type Path State Status Progress Healthy Updated
----------- ---- ------------ ------- -------------- --------- ------- --------
svm0:trident_pvc_bd77f21b_4522_41e9_bfa1_fca7cf8af672
XDP svm1:trident_pvc_32d31b4d_ec91_41cd_a95e_9e74bd6b2d9b
Snapmirrored
Idle - true -
Conclusion: Your data, unstoppable!
With NetApp Trident and SnapMirror, you’ve built a robust disaster recovery pipeline for your Kubernetes applications. Every artifact—secrets, back ends, storage classes, PVCs, TMRs, Snapshot copies, and pods—plays a critical role in making sure that your data is replicated, protected, and ready for any scenario. Follow the steps in this guide, and your clusters will be prepared to handle disasters with confidence. Additionally, check out Trident protect, which helps you manage SnapMirror relationships, fail over data, and manage the complete application. For more detailed information on using SnapMirror replication with Trident protect, check out the NetApp Trident protect documentation.
Want to dive deeper? Check out the NetApp Trident documentation for more insights. Happy replicating!
Is small the next big thing in AI? Do all AI workloads need GPUs?
As we reach the crucial GA milestone and the final stages of our product launch this month, I reflected on these two customer-centric questions about AI infrastructure costs, along with recent developments in tokenization and the economics of deploying large language models (LLMs).
When it comes to AI inferencing and LLM deployments, not all AI workloads or business use cases require the powerful capabilities of large language models (LLMs) with 20+ billion parameters. Smaller language models (SLMs) have emerged as a powerful and practical alternative, especially for tasks that require specific domain expertise and for customers with resource-constrained environments. Evidence-based research suggests that when fine-tuned with tailored datasets, SLMs tend to outperform larger general-purpose models in specific tasks and domains, such as medical diagnostics or legal analysis. One finding from recent developments is VeriGen, an AI model fine-tuned from the open-source CodeGen-16B language model (16 billion parameters) to generate Verilog code, a hardware description language (HDL) used for design automation in the semiconductor and electronics industry. Furthermore, as the CEO of AI startup Hugging Face once suggested, up to 99% of use cases could be addressed using SLMs. With SLMs demonstrating these capabilities, the growing notion that "small is the next big thing in AI" leaves many AI engineers with new architectural considerations and RAG deployment options to choose from.
On the other hand, the choice between CPU and GPU also depends on the specific requirements of the AI application. While some AI workloads benefit more from the parallel processing capabilities of GPUs, other customers may prioritize low latency, which CPUs can provide. Moreover, procuring GPUs can be a challenge because of hardware availability and supply-chain constraints, in addition to budget considerations, slowing customers from reaching the finish line in their AI product development and deployment lifecycle. A thorough understanding of the customer’s workload characteristics and performance requirements leads to sound AI design decisions and prevents over-engineering a system design with unnecessary complexity.
The collaboration between NetApp and Intel brings these perspectives and customer pain points into consideration with a product strategy that includes feasibility and viability via the NetApp® AIPod™ Mini, a retrieval-augmented generation (RAG) system designed for air-gapped AI inferencing workloads without the need for GPUs.
Announcing the General Availability of NetApp AIPod Mini
A growing number of organizations are leveraging RAG applications and LLMs to interpret user prompts. These prompts and responses can include text, code, images, or even therapeutic protein structures retrieved from an organization’s internal knowledge base. RAG accelerates knowledge retrieval and efficient literature review by quickly providing researchers and business leaders with relevant and reliable information.
The NetApp® AIPod™ Mini combines NetApp intelligent data infrastructure (NetApp AFF A-Series systems powered by NetApp ONTAP® data management software) with compute servers based on Intel® Xeon® 6 processors with Intel® Advanced Matrix Extensions (Intel® AMX), Intel AI for Enterprise RAG, and the OPEA software stack.
The NetApp® AIPod™ Mini supports pre-trained models with up to 20 billion parameters (for example, Llama-13B, DeepSeek-R1-8B, Qwen 14B, Mistral 7B). Intel AMX accelerates AI inferencing across a range of data types (for example, INT4, INT8, BF16). The NetApp AIPod Mini, jointly tested by NetApp and Intel using optimization techniques like activation-aware weight quantization (AWQ) for accuracy and speculative decoding for inference speed, delivers up to 2,000 input/output tokens for 30+ concurrent users at 500+ tokens per second (TPS), balancing the trade-off between speed and accuracy for the user experience. The released benchmark results can be found in MLPerf Inference 5.0.
Advantages of running RAG system with NetApp AIPod Mini:
NetApp ONTAP data management. Provides enterprise-grade storage to support various types of AI workloads, including batch and real-time inferencing, with the velocity and scalability to handle large datasets for versioning.
Multiprotocol data access. Client AI applications can read data by using S3, NFS, and SMB file-sharing protocols, which facilitates data access in multimodal LLM inference scenarios.
Data protection and confidentiality. Built-in NetApp Autonomous Ransomware Protection (ARP) and both software- and hardware-based encryption enhance confidentiality and security for RAG applications that retrieve knowledge from a company’s document repositories.
In addition to a data pipeline powered by NetApp intelligent data infrastructure, you get OPEA for Intel® AI for Enterprise RAG. OPEA simplifies transforming your enterprise data into actionable insights, and Intel AI for Enterprise RAG provides key features that enhance scalability, security, and user experience. OPEA includes a comprehensive framework featuring LLMs, datastores, prompt engines, and RAG architectural blueprints.
RAG systems and LLMs are technologies that work together to provide accurate and context-aware responses retrieved from your organization’s internal knowledge repository. NetApp has been a leader in data management, data mobility, data governance, and data security technologies across the ecosystem of edge, data center, and cloud. The NetApp AIPod Mini delivers an air-gapped RAG inferencing pipeline to help enterprise customers deploy generative AI technologies with significantly less computational power and boost their business productivity.
To learn more about the solution design and validation, refer to NetApp AIPod Mini for Enterprise RAG Inferencing | NetApp.